Natural Language Processing for Computer-Mediated

Authors

  • Michael Beißwenger
  • Torsten Zesch
Abstract

This paper presents the DiDi Corpus, a corpus of South Tyrolean Data of Computer-mediated Communication (CMC). The corpus comprises around 650,000 tokens from Facebook wall posts, comments on wall posts and private messages, as well as socio-demographic data of participants. All data was automatically annotated with language information (de, it, en and others), and manually normalised and anonymised. Furthermore, semi-automatic token-level annotations include part-of-speech and CMC phenomena (e.g. emoticons, emojis, and iteration of graphemes and punctuation). The anonymised corpus without the private messages is freely available for researchers; the complete and anonymised corpus is available after signing a non-disclosure agreement.

Similar resources

Using Generalized Language Model for Question Matching

Question answering is one of the popular services on the World Wide Web. The main goal of these services is to find the best answer to a user's input question as quickly as possible. To achieve this aim, most of these services use new techniques for question matching. There are many question answering services on the Persian web, so it seems that developing a question matching m...

Effects of CALL-Mediated TBLT on Self-Efficacy for Reading among Iranian University Non-English Major EFL Students

The rich and still expanding literature on TBLT is helping to mature both its theoretical conceptualization and practical implementation in foreign and second language education. Similarly, computer-assisted language learning (CALL) has grown as a field, with the use and integration of technology in the classroom continuing to increase, and it will continue to play an important role in this maturat...

Learning Pragmatics through Computer-Mediated Communication in Taiwan

This study investigated the effectiveness of explicit pragmatic instruction on the acquisition of requests by college-level English as a Foreign Language (EFL) learners in Taiwan. The goal was to determine first whether the use of explicit pragmatic instruction had a positive effect on EFL learners’ pragmatic competence. Second, the relative effectiveness of presenting pragmatics through two deli...

Applying Computer-Mediated Active Learning Intervention to Improve L2 Listening Comprehension

This study aims to apply active learning in a foreign language context to improve L2 learners’ listening comprehension. Participants in this attempt were 56 EFL learners between 13 and 15 years old. To amass the required data, learners went through a ten-week treatment, in which participants in the experimental group received computer-mediated active learning intervention and...

Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods

For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, researchers have turned to the design and production of speech recognition systems. Among these, lip-reading techniques face many challenges for speech recognition; one of the challenges b...


Publication date: 2015